Functional analysis of MMR gene VUS from potential Lynch syndrome patients

Lynch syndrome is caused by inactivating variants in DNA mismatch repair genes, namely MLH1, MSH2, MSH6 and PMS2. We have investigated five MLH1 variants and one MSH2 variant that we identified in Turkish and Tunisian colorectal cancer patients. These variants comprised two small frameshift variants resulting in premature stops, which could be classified as pathogenic (MLH1 p.(His727Profs*57) and MSH2 p.(Thr788Asnfs*11)), but also two missense variants (MLH1 p.(Asn338Ser) and p.(Gly181Ser)) and two small in-frame deletion variants (p.(Val647_Leu650del) and p.(Lys678_Cys680del)). For such small coding genetic variants, it is unclear whether they are inactivating or not. Here we provide a clinical description of the variant carriers and their families, and we performed biochemical laboratory testing on the variant proteins to test whether their stability or their MMR activity is compromised. Subsequently, we compared the results to in-silico predictions on structure and conservation. We demonstrate that neither missense alteration affected function, while both deletion variants caused a dramatic instability of the MLH1 protein, resulting in MMR deficiency. These results were consistent with the structural analyses that were performed. The study shows that knowledge of protein function may provide molecular explanations of results obtained with functional biochemical testing and can thereby, in conjunction with clinical information, elevate the evidential value and facilitate clinical management in affected families.


Introduction
Colorectal cancer accounts for about 10% of all cancers diagnosed each year and of cancer-related deaths worldwide [1]. It is the second most common type of cancer diagnosed in women and the third most common type of cancer in men. The global incidence of colorectal cancer is expected to rise to 2.5 million new cases by 2035 [2].
It has a prevalence of approximately 7 per 100,000 people in Turkey, with approximately 5000 new cases and 3200 deaths each year [3]. It is also a serious public health issue in Tunisia, according to the International Agency for Research on Cancer. The ASR (Age Standardized incidence Rate) was 10.9 per 100,000 in 2012, which is a low to medium rate [4]. The predicted CRC ASR would be 39.3/100,000 [95% CI: 32.9/100,000-48.8/100,000] in 2024 [4].
Somatic loss of the wild-type (WT) allele of one of the MMR genes causes cellular deficiency of the affected protein, which can be detected by immunohistochemistry of the tumor tissue. It results in MMR deficiency and, in turn, in the molecular tumor phenotype of microsatellite instability (MSI), which can be detected by PCR of tumor tissue. Errors made by the replicative DNA polymerase then remain uncorrected, which is thought to be a critical mechanism in the development of Lynch-associated cancers by causing a spontaneous "mutator phenotype" in affected cells [8].
The MSI phenotype is not limited to Lynch syndrome, and can be found in approximately 15% of sporadic colorectal cancers, which most frequently show somatic loss of MLH1 due to promoter hyper-methylation, accompanied by a BRAF V600E variant [9].
Lynch syndrome patients are at an increased risk to develop cancers other than CRC such as endometrial cancer, cancers of the small bowel, stomach, ovaries, renal pelvis, ureter, and hepatobiliary system [9].
The majority of germline variants are detected in the MLH1 gene (50%) followed by the MSH2 gene (40%), with only 10% found in the MSH6 and PMS2 genes [10]. A high proportion of these are of unknown clinical significance [11,12]. They are termed variants of unclear significance (VUS) or unclassified variants (UV). Properly targeted cancer surveillance in carrier families requires that the variant be classified as pathogenic [13]. Without classification as pathogenic, a diagnosis cannot be made, and relatives of patients are unable to receive predictive testing and targeted preventive surveillance. Thus, determining the pathogenicity of these increasingly common variants in cancer-predisposing genes presents a significant challenge to clinical geneticists. It is critical to determine which variants in the MMR genes are involved in pathogenesis [14].
The Variant Interpretation Committee (VIC) of the International Society for Gastrointestinal Hereditary Tumors (InSiGHT) used standards established by the International Agency for Research on Cancer (IARC) and has evaluated qualitative or quantitative integration of evidence to classify variants (https://www.insight-group.org/) [15,16]. The VIC has reclassified some MMR gene VUS as clinically pathogenic (class 5, with probability of pathogenicity >0.99; or class 4, with probability of pathogenicity >0.95) or clinically benign (class 1, with probability of pathogenicity <0.001; or class 2, with probability of pathogenicity <0.05) [15] with clinical recommendations [16,17].
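The probability thresholds quoted above can be read as a simple lookup. The following is a minimal sketch, not the VIC's procedure: the function name `iarc_class` is hypothetical, the strict inequalities are taken as written in the text, and the label "class 3" for the remaining interval is an assumption based on the five-tier scheme.

```python
def iarc_class(probability: float) -> int:
    """Map a probability of pathogenicity to a five-tier class.

    Thresholds follow the text: class 5 (>0.99), class 4 (>0.95),
    class 2 (<0.05), class 1 (<0.001). Everything in between is
    returned as class 3 (uncertain), which is an assumption here.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability > 0.99:
        return 5  # pathogenic
    if probability > 0.95:
        return 4  # likely pathogenic
    if probability < 0.001:
        return 1  # benign
    if probability < 0.05:
        return 2  # likely benign
    return 3      # uncertain significance (assumed label)


if __name__ == "__main__":
    for p in (0.995, 0.97, 0.5, 0.01, 0.0005):
        print(p, "-> class", iarc_class(p))
```

In practice the difficulty lies in obtaining a defensible probability in the first place; the mapping itself is the trivial last step.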
Segregation analysis in families, population allele frequencies, and tumor pathology are all common methods for analyzing variants in MMR and other cancer predisposition-associated genes, but frequently do not provide sufficient evidential value for a classification. Therefore, functional analysis provides a valuable additional tool to facilitate classification [14]. Structural analysis may further corroborate the conclusions deduced from the functional analysis, specifically if discrete functions can be associated with the analyzed residues [14,18,19].
To improve Lynch syndrome diagnosis in our patients, we performed MLH1 functional analysis for the first time for variants identified in Turkish patients, as done previously for Tunisian patients [14], using a clinically calibrated assay [19].
We investigated four MLH1 variants found in colorectal cancer patients: two missense variants and two in-frame deletions of three and four residues. We show that the results of these functional analyses are consistent with data obtained from residue conservation and protein structure analysis. This information, when combined with the clinical data provided for the carrier patients, suggests that two variants are likely causative for Lynch syndrome and two others are likely benign. The provided information and results will facilitate more definite classification of the investigated variants, thereby enabling predictive testing for family members and better targeted surveillance measures and lifestyle counseling in affected patients in Tunisia, Turkey and elsewhere [20].

Patients
This study includes both unpublished and published data from three Turkish and three Tunisian patients who met the Amsterdam or the Bethesda criteria. In these patients, germline DNA sequencing was performed, as well as tumor testing (microsatellite instability analysis and immunohistochemistry as indicated). Family cancer histories were obtained [21]. Samples and clinical data were anonymized prior to analysis. The data were accessed for research purposes for Tunisian patients on May 21st, 2021 and for Turkish patients on June 5th, 2022.
The study was carried out in accordance with the Helsinki Declaration, and it was approved by the Ethics and Research Committee of Farhat Hached University Hospital, Sousse, Tunisia on May 10th, 2021 (OHRP IRB 00008931). The Ethical Committee of Trakya University Faculty of Medicine, Edirne, Turkey declared that its approval was not required for this study. During genetic counseling sessions, all patients were informed about their inclusion in the registries, and written informed consent was obtained from all participants.

Next-generation sequencing (NGS)
Turkish patients were sequenced in the medical genetics department of Trakya University Hospital in Turkey using the TruSight 1 Cancer Sequencing Panel (Illumina) and Qiaseq Targeted DNA Panel (Qiagen) according to the manufacturers' instructions for NGS. The pooled and barcoded libraries were subsequently sequenced using a NextSeq sequencer (Illumina Inc.) [22].
Two of the Tunisian patients were sequenced in the laboratory of Molecular and Cellular Screening Processes, Center of Biotechnology of Sfax in Tunisia using a MiSeq sequencer (Illumina). One Tunisian patient was sequenced in the Pathology Department of Leiden University Medical Center (LUMC) in the Netherlands as previously described [21]. Briefly, DNA was sequenced with the Ion Proton System (Life Technologies, Carlsbad, CA, USA) using a custom MMR panel [23]. Libraries were prepared with Ion AmpliSeq™ Library Kit 2.0 according to the manufacturer's protocol. The Proton sequencer generated unaligned BAM files, which were mapped against the human reference genome (GRCh37/hg19) using the TMAP 5.0.7 software with default parameters (https://github.com/iontorrent/TS) [21].

Nomenclature and classification of genetic variants
The nomenclature guidelines of the Human Genome Variation Society (HGVS) were used to describe the detected genetic variants [24]. Mutalyzer [25] was used to validate the nomenclature of all variants. The recurrence of the identified variants was determined by interrogating the Leiden Open Variation Database (LOVD), ClinVar, and the Human Gene Mutation Database (HGMD). The InSiGHT database was used to check for current classifications of the variants [15].
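HGVS coding-DNA names already indicate whether a small deletion or insertion preserves the reading frame. As an illustration only, the sketch below parses the three indel names reported in this study (c.1940_1951del, c.2180delA and c.2362_2363insA) and checks whether the net length change is a multiple of three. The helper names are hypothetical, and the regular expressions cover just these simple forms; this is not a general HGVS parser and not a substitute for Mutalyzer.

```python
import re

# Deliberately narrow patterns: simple deletions and insertions only.
_DEL_RANGE = re.compile(r"^c\.(\d+)_(\d+)del[ACGT]*$")
_DEL_SINGLE = re.compile(r"^c\.(\d+)del[ACGT]?$")
_INS = re.compile(r"^c\.(\d+)_(\d+)ins([ACGT]+)$")


def net_length_change(hgvs_c: str) -> int:
    """Return the net change in coding nucleotides (negative for deletions)."""
    match = _DEL_RANGE.match(hgvs_c)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        return -(end - start + 1)
    if _DEL_SINGLE.match(hgvs_c):
        return -1
    match = _INS.match(hgvs_c)
    if match:
        return len(match.group(3))
    raise ValueError(f"unsupported variant description: {hgvs_c}")


def is_in_frame(hgvs_c: str) -> bool:
    """True if the reading frame is preserved (change divisible by three)."""
    return net_length_change(hgvs_c) % 3 == 0


if __name__ == "__main__":
    for name in ("c.1940_1951del", "c.2180delA", "c.2362_2363insA"):
        print(name, net_length_change(name), is_in_frame(name))
```

The 12-nucleotide deletion corresponds to four codons, matching the four-residue deletion p.(Val647_Leu650del), whereas the two single-nucleotide changes shift the frame.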

Cell lines
HEK293T cells (deficient for endogenous MLH1 and PMS2) [26] were kindly provided by Prof. Jiricny, Zurich, Switzerland. Their identity was confirmed by comparing their genomic short tandem repeat (STR) profile from 9 loci to the source HEK293T cell line DSMZ ACC 635, and then by a variable number of tandem repeats (VNTR) profile from the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany, in 02/2009 and 06/2018. They were also routinely tested for mycoplasma and verified by morphology, growth curve analysis, and expression of proteins (e.g., MLH1 and PMS2).
The cells used in this work were freshly thawed from frozen aliquots of these verified batches. They were grown in Dulbecco's Modified Eagle Medium (Invitrogen) with 10% fetal calf serum (PAA Laboratories) and 1% penicillin-streptomycin (Sigma) [25].

Protein expression and quantification
The HEK293T cell line, pcDNA3-MLH1, and pSG5-PMS2 have previously been described [26]. The Q5 Site directed mutagenesis system (New England Biolabs, Frankfurt, Germany) was used to generate missense variants with appropriate primers according to the manufacturer's protocols. Direct sequencing was used to confirm all of the plasmids that were created [14].
The extracts were examined using SDS-PAGE and immunoblotting with anti-MLH1, G168-728 from BD Biosciences, as well as anti-PMS2, E-19, and anti-beta-Actin, C2 from Santa Cruz Biotechnologies. A Fuji LAS-4000 mini camera and Multi Gauge v3.2 were used to detect and quantify chemiluminescence signals (Immobilon, Millipore) [14].

Evaluation of expression defects with respect to pathogenicity
Protein expression and quantification were performed in parallel with a neutral control variant with impaired stability (MLH1 p.(Val716Met)) and a pathogenic control variant with severe destabilization (MLH1 p.(Ala681Thr)). When the expression of the variant in question is similar to or lower than that of the pathogenic control variant, the protein stability defect is considered clinically pathogenic [28,29].
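The comparison against the two reference variants can be written down as a small decision rule. The sketch below is an illustration under stated assumptions, not the published evaluation procedure: the function name `stability_call` is hypothetical, the tolerance that operationalizes "similar to" is an arbitrary placeholder, the rule for the neutral reference is inferred from how that reference is used in the Results, and the example percentages are invented.

```python
def stability_call(variant_pct: float,
                   pathogenic_ref_pct: float,
                   neutral_ref_pct: float,
                   tolerance_pct: float = 5.0) -> str:
    """Compare a variant's expression (percent of wild type) with the
    pathogenic and the neutral stability reference variants.

    tolerance_pct is a hypothetical margin for "similar to".
    """
    if pathogenic_ref_pct >= neutral_ref_pct:
        raise ValueError("pathogenic reference must be below neutral reference")
    if variant_pct <= pathogenic_ref_pct + tolerance_pct:
        return "pathogenic stability defect"
    if variant_pct >= neutral_ref_pct - tolerance_pct:
        return "no clinically relevant stability defect"
    return "intermediate"


if __name__ == "__main__":
    # Invented example values, in percent of wild-type expression.
    pathogenic_ref, neutral_ref = 30.0, 60.0
    for name, pct in (("variant A", 3.0), ("variant B", 98.0), ("variant C", 45.0)):
        print(name, stability_call(pct, pathogenic_ref, neutral_ref))
```

Because both references are transfected and quantified in parallel with the variant, the rule compares values from the same experiment rather than against fixed cut-offs.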

MMR activity
A validated procedure was used to assess the in vitro MMR activity of MLH1 variants, yielding clinically meaningful results [28,30]. Protein extracts were mixed with 35 ng of DNA substrate containing a G-T mismatch and an 83-bp single-strand nick. After incubation at 37 °C, the DNA substrate was purified and digested with EcoRV and AseI. The restriction fragments were separated in agarose gels and then analyzed using GelDoc XR plus detection and QuantityOne software (Bio-Rad). The repair efficiency (e) was calculated as follows: e = (intensity of repaired substrate bands)/(intensity of all bands of substrate). Because e is a ratio of band intensities, the amount of DNA recovered during plasmid purification has no effect on this outcome. Total repair efficiencies typically ranged between 50 and 90%. The repair efficiency of MLH1 variants was calculated relative to a wild-type protein produced in parallel as e (relative) = e (variant)/e (wild type) * 100 [14].
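The two formulas above translate directly into code. The following is a minimal sketch with hypothetical function names and invented band intensities; it also illustrates why the amount of recovered DNA cancels out, since scaling all band intensities by the same factor leaves the ratio unchanged.

```python
import math


def repair_efficiency(repaired_bands, unrepaired_bands) -> float:
    """e = intensity of repaired bands / intensity of all substrate bands."""
    repaired = sum(repaired_bands)
    total = repaired + sum(unrepaired_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return repaired / total


def relative_efficiency(e_variant: float, e_wild_type: float) -> float:
    """e(relative) = e(variant) / e(wild type) * 100, in percent."""
    if e_wild_type <= 0:
        raise ValueError("wild-type efficiency must be positive")
    return e_variant / e_wild_type * 100.0


if __name__ == "__main__":
    # Invented intensities: two repaired fragments and one unrepaired band.
    e_wt = repair_efficiency([30.0, 40.0], [30.0])
    e_var = repair_efficiency([15.0, 20.0], [65.0])
    # Doubling every intensity (twice as much DNA recovered) gives the same e.
    e_wt_scaled = repair_efficiency([60.0, 80.0], [60.0])
    print(e_wt, e_wt_scaled, math.isclose(e_wt, e_wt_scaled))
    print(relative_efficiency(e_var, e_wt))
```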

Structural and bioinformatic analyses
An alignment containing >900 non-redundant full-length sequences of eukaryotic MLH1 proteins was generated to assess residue conservation using manually curated BLAST hits obtained by interrogating the human MLH1 protein reference sequence NP 00240.1. Multiple hits from the same organism were reduced to one, and non-MLH1 sequences were identified and removed due to the lack of a highly conserved C-terminal FERC motif [31,32].
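Two steps of this procedure lend themselves to a short illustration: filtering sequences by the C-terminal FERC motif and counting residue frequencies in an alignment column. The sketch below uses hypothetical function names and a tiny invented alignment, and it assumes for simplicity that the motif forms the last four residues of the ungapped sequence; the curation described above was manual and may have applied different criteria.

```python
from collections import Counter


def has_c_terminal_ferc(aligned_sequence: str) -> bool:
    """Assumption: a genuine MLH1 sequence ends with the residues FERC."""
    return aligned_sequence.replace("-", "").upper().endswith("FERC")


def column_frequencies(aligned_sequences, column: int) -> dict:
    """Relative frequency of each residue in one alignment column (gaps ignored)."""
    residues = [seq[column] for seq in aligned_sequences if seq[column] != "-"]
    if not residues:
        return {}
    counts = Counter(residues)
    return {residue: n / len(residues) for residue, n in counts.items()}


if __name__ == "__main__":
    # Invented toy alignment; the last sequence lacks the FERC motif.
    alignment = ["MANKFERC", "MADKFERC", "MASKFERC", "MANKFERC", "MANKAAAA"]
    mlh1_like = [seq for seq in alignment if has_c_terminal_ferc(seq)]
    print(len(mlh1_like))
    print(column_frequencies(mlh1_like, 2))
```

A column in which several residues occur at appreciable frequency, as in this toy example, would be read as only moderately conserved.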

Clinical characteristics of affected carriers of identified MLH1 and MSH2 variants
Six variants identified in six index patients meeting the Amsterdam II and/or the Bethesda II guidelines were chosen for analysis (Table 1) [38,39].
Three of the variants were found in Tunisian patients. Two of them are novel variants, a single-nucleotide deletion and a single-nucleotide insertion, each leading to a frameshift and a premature stop codon, in MLH1 (c.2180delA, corresponding to p.(His727Profs*57)) and MSH2 (c.2362_2363insA, corresponding to p.(Thr788Asnfs*11)). These were excluded from further functional analysis in this study since they can be classified as pathogenic without further efforts (see below). The third Tunisian patient carried a small in-frame deletion in MLH1 (c.1940_1951del, corresponding to p.(Val647_Leu650del)) [21]. The clinical significance of this variant is unclear.
Family data were acquired for all affected patients (Fig 1). Additional cancer cases were present in all families. Most of the tumors in index cases were colorectal and showed a loss of MLH1 protein expression, except for the carrier of variant p.(Gly181Ser), for whom immunohistochemistry showed normal expression of the MMR proteins (Table 1).
This patient (Family TURK-1, PIII.4) was diagnosed at the age of 54 with colon cancer in stage III. His paternal aunt also suffered from the same type of cancer; his son developed stomach cancer and other members of his family developed lung cancer.
The variant p.(Asn338Ser) was discovered in family TURK-2, in which 5 members (PII.11, PII.12, PIII.6, PIII.17 and PIII.19) were affected by CRC. DNA sequencing was carried out for only one of them (PIII.6), who had rectal cancer at the age of 62. The tumor of this index case showed normal expression of MLH1 and PMS2 proteins. Two other family members were diagnosed with stomach cancer (PIII.1 and PIII.2), one (PII.17) was affected by laryngeal cancer and another one by bladder cancer (PIII.5). In TURK-3, the index case (PII.3), who is a confirmed carrier of the variant p.(Lys678_Cys680del), had synchronous colon and endometrium cancers. She was diagnosed at the age of 62. Her CRC showed a loss of MLH1 and PMS2 protein expression in immunohistochemistry. Her father and her brother were also diagnosed with colon cancer but they were not tested.
Patients PIII.3 and PIII.6 (two sisters from a consanguineous marriage) of the family TUN-1 were found to carry the same variant, p.(Val647_Leu650del). They developed colorectal cancer at the ages of 42 and 44, respectively. Both showed loss of MLH1 and PMS2 protein expression and loss of heterozygosity with retention of the variant in their tumors. Their paternal cousin also has CRC, while their mother had uterine cancer [21].
The variant p.(Thr788Asnfs*11) was identified in patient PIII.5 of family TUN-2. This index case suffered from Muir-Torre syndrome with multiple sebaceous adenomas, associated with polyps that were discovered on analysis of the surgical specimen of a right colonic carcinoma resected at the age of 57 years. His two sisters (PIII.1 and PIII.2) developed gastric and colonic cancer at the ages of 45 and 60, respectively. Their father presumably died of cancer. Finally, in the family TUN-3, we found the variant p.(His727Profs*57) in the index case (PIII.5), who developed a colonic tumor at the age of 26 and underwent a right hemi-colectomy one year later. This variant was also identified in his brother (PIII.4), who was referred to our center for genetic counseling. At the time of consultation, he was 33 years old and did not exhibit any symptoms associated with the variant. Most likely, Lynch syndrome was inherited from their father (PII.8), who developed caecum cancer at the age of 46 and underwent a right hemi-colectomy; unfortunately, this patient was not available for genetic testing. This family is positive for the Amsterdam II criteria.
In contrast, the available clinical data are insufficient to classify the small coding variants: it was not possible to perform additional genetic analyses in all relatives of patients to assess cosegregation, which is a highly reliable method for the assessment of pathogenicity. Moreover, insufficient information on molecular tumor traits (microsatellite instability and BRAF status) was available for pathogenicity clarification. Consequently, it is unclear whether the described small coding variants are pathogenic, and whether the carrier patients indeed have Lynch syndrome. To assess pathogenicity, we therefore set out to assemble further informative knowledge in alternative ways and performed functional testing of the genetic variants in vitro.

Evaluation of protein stability and MMR efficiency of MLH1 variants by functional analysis
We performed functional analyses on the four small coding variants to determine if the identified alterations cause a defect of function in the protein; if this were the case, a causal involvement in the observed cancer cases is most likely.
Small coding variants frequently cause functional defects in MLH1 by destabilizing the protein or by interfering with DNA repair activity [28,32,40,41]. Decreases in protein stability lead to lower intracellular protein levels, which, even if the protein is functional, may result in DNA mismatch repair deficiency if the level falls below a certain threshold [30]. We used previously established reference variants to translate expression defects into pathogenicity statements [30]. Of these, the MLH1 p.(Ala681Thr) pathogenic Lynch syndrome reference variant served to identify variants whose destabilization is severe enough to confer a pathogenic effect in humans due to low cellular protein levels. Furthermore, the slightly destabilized, clinically neutral polymorphism MLH1 p.(Val716Met) was used as a reference for clinically neutral stability defects [30]. We thoroughly compared the expression levels of all variants to the wild-type MLH1 protein and the two reference variants (Fig 2A). The outcomes of several independent experiments were summarized (Fig 2B).
Two variants, p.(Gly181Ser) and p.(Asn338Ser), were as strongly expressed as the wild-type protein and thus do not exhibit clinically relevant stability issues, as judged against the expression of the p.(Val716Met) reference variant (Fig 2B). A t-test revealed no statistically significant difference between them and the wild-type protein (P > 0.05).
In contrast, the variants p.(Lys678_Cys680del) and p.(Val647_Leu650del) showed significantly compromised stability, with expression at the limit of detection; the decrease was much stronger than that of the pathogenic reference variant p.(Ala681Thr).
A pathogenic defect can be concluded for these highly destabilized variants because insufficient MLH1 protein is present in the cell [30].
Another major reason that small coding MLH1 variants may be pathogenic is that they confer catalytic inactivity of the variant MLH1 protein. Since the primary function of MLH1 is to aid in DNA mismatch repair, we tested the variants' ability to perform the repair reaction in vitro (Fig 3A). Several independent experiments were carried out to validate the findings (Fig 3B).
The p.(Gly181Ser) and p.(Asn338Ser) variants had MMR efficiency comparable to the wild type, whereas the p.(Lys678_Cys680del) and p.(Val647_Leu650del) variants were similar to the negative control and had therefore completely lost repair activity.
Taken together, in this study, functional assays revealed that p.(Lys678_Cys680del) and p.(Val647_Leu650del) are variants that confer a functional defect on the MLH1 protein. However, the functionality of the variants p.(Gly181Ser) and p.(Asn338Ser) was, in the applied assay systems, indistinguishable from that of wild-type.

Conservation and structural roles of the affected residues
The analysis of the structural role and the conservation of variant residues can help to explain the findings of functional studies. This can represent confirmatory evidence if the gathered information is in good agreement with the functional observations. We therefore evaluated the conservation and positions of the affected residues within the structure of the MLH1-PMS2 heterodimer. The glycine at codon 181 is replaced by serine, an amino acid with similar biochemical properties. Gly181 is located at the backside β-sheet of the ATPase pocket, but directed . All these observations strongly suggest that Gly181Ser is a neutral substitution, which is in perfect agreement with the results of the biochemical analyses.
Asn338 is located at the very beginning (first residue) of the unstructured linker that connects the N- and C-terminal domains (Fig 4). The linker is for the most part neither conserved nor structured and rarely contains pathogenic alterations [42]. An exception is a small, functionally highly relevant motif, which lies at a sufficient distance from Asn338 (depicted as ConMot in Fig 4) [18]. The linker also shows some conservation in the shown N-terminal region following the α-helix and Asn338 (Fig 4, left panel). For example, there is a strongly conserved arginine, which is a frequent observation at the C-termini of α-helices. Such residues are often involved in helix capping by forming loops through main chain interactions and compensating helix dipoles [43], which is also the case here according to AlphaFold2 structure predictions. Asn338 itself shows only intermediate conservation (Fig 4, left panel: N). Besides asparagine (N), aspartate (D) and also serine (S) occur regularly at this position in MLH1 proteins [32], suggesting that serine is a tolerated substitution. Additionally, no specific function of the Asn338 residue is obvious from structural analyses. This is in agreement with the biochemical finding that Asn338Ser has normal, wild-type-like protein function.
In comparison to missense substitutions, deletions can be expected to confer more dramatic effects on average, since they have a greater potential to distort the structure of a protein.However, if deletions or insertions are located in unstructured areas (intrinsically disordered regions, IDRs), or loops between secondary structural elements, they may also be tolerated and not confer any effect.This we have described before for extensive artificial deletions in the MLH1 linker [18] but also for a small deletion identified in a human cancer patient that is located at the border between a C-terminal helix and a loop [31].
The deletion Val647_Leu650 (VPPL) affects the beginning of an α-helix including a fraction of the preceding loop and comprises the strongly conserved Pro648 residue (Fig 4). This highly conserved proline likely facilitates helix formation and orientation, since it is placed in a typical position [31]. The deletion removes this relevant proline and significantly shortens the loop structure required for correct positioning of the helix, thereby destroying the local structure of the MLH1 protein, explaining the instability and the biochemical loss of function.
The deletion Lys678_Cys680 affects the middle of an extended α-helix. While none of the three deleted residues displays significant conservation, the most dramatic effect is conferred by the relative rotation of the conserved hydrophobic residues of the helix (marked yellow in Fig 4). These hydrophobic residues are normally oriented towards one side of the helix, pointing to the protein's interior, and serve to anchor this side of the helix in the hydrophobic core of the protein. As a result of the deletion, the orientation of internal (hydrophobic) and external (hydrophilic) residues is distorted (S1 Fig). Consequently, the deletion can be expected to have a significant impact on protein structure (and function), as evidenced by the low stability of the variant protein.
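The size of this rotation can be estimated from ideal helix geometry. An ideal α-helix has 3.6 residues per turn, so each residue advances the helical phase by 100°. The sketch below, with a hypothetical function name, computes the angular offset that a deletion of n residues introduces between the helix segments on either side; it is a back-of-the-envelope estimate under the assumption of ideal geometry, not the structural analysis performed in this study.

```python
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues


def helical_offset_degrees(n_deleted: int) -> float:
    """Smallest angle by which residues after the deletion are rotated
    relative to their original orientation, assuming an ideal alpha-helix."""
    if n_deleted < 0:
        raise ValueError("number of deleted residues must not be negative")
    shift = (n_deleted * DEGREES_PER_RESIDUE) % 360.0
    return min(shift, 360.0 - shift)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, helical_offset_degrees(n))
```

Under this idealization a three-residue deletion rotates the downstream part of the helix by about 60°, enough to turn a hydrophobic face partly away from the protein core.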
Thus, in summary, the functional results are highly consistent with the information deduced from residue conservation and from the role of the residues in protein structure.

Discussion
In this study, we report the identification, description and evaluation of five MLH1 variants and one MSH2 variant found in three Turkish and four Tunisian patients with suspected Lynch syndrome. The identified variants have either not been reported before or have no informative classification of their clinical effects, forestalling predictive testing and targeted surveillance in affected families (Table 1). As in many similar cases, even when summarizing the available evidence from our current investigation and others, it remains unclear whether these protein variants are functional, and clinical data are insufficient for pathogenicity classification. We have therefore performed functional and structural analyses to provide additional lines of evidence concerning their biological and clinical effects. For functional investigation, we utilized a standardized assay [14] that uses clinically established reference variants for pathogenicity assessment [14,44]. While the methodologically similar CIMRA assay directly enables calculation of posterior probabilities of pathogenicity [45], this procedure provides better data on the molecular reasons for functional loss.
Of the examined variants, two could be classified straightforwardly as pathogenic (class 5) by application of the criteria provided by the variant interpretation committee (VIC) of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) (https://www.insight-group.org/criteria/), since their expected result is a premature translational stop, resulting in a truncated, non-functional protein (Table 2). The ACMG classification [46] is concordant for the MSH2 truncating variant, which also includes expert panel reviews (S1 Table) [15,47-49]. However, the ACMG classification scheme is more cautious for the MLH1 truncating variant since the truncation removes only a comparatively small C-terminal fraction of the protein (S1 Table). Nevertheless, this fraction contains important, highly conserved sequence indispensable for MLH1 function, and truncations starting from position MLH1 Tyr750 have been shown to be deleterious before [18,19]. Consistent with this, the affected family (TUN-3) fulfills the Amsterdam II criteria (Fig 1).
Of the remaining four variants (two missense variants and two in-frame deletions), one is novel and the other three are currently classified as variants of uncertain significance (Table 1).
For the two deletion variants, we could show loss of expression compatible with a disease-causing defect using a clinically calibrated MMR assay system [30]. All results of this assay system have so far been consistent with variant classifications where available (see S1 Table of [14]), confirming its reliability. Moreover, both deletions destroy secondary structures, and they are located within a domain for which we have shown before that alterations make the protein susceptible to destabilization [30]. This is consistent with the absence of MLH1 in the patients' tumors in IHC (Table 1). Both variants currently lack the data required for a final classification in class 4 (likely pathogenic) according to current InSiGHT criteria for variant classification (Table 2). However, the ACMG guidelines allow the conclusion that the variants are pathogenic (Table 1, S1 Table).
The findings for the missense variants (p.(Gly181Ser) and p.(Asn338Ser)) supported neutrality: there were no detectable defects in functionality. These variants retained a sufficiently high expression level that is above the clinical reference variant for proficient stability (p.(Val716Met)), and catalytic activity was not compromised by the alterations. These observations are consistent with the low conservation of the residues and the occurrence of the substitutions in evolution, and also consistent with their very low MAPP+PolyPhen-2 prior probability of pathogenicity (Table 1). However, providing functional evidence for neutrality is more complex since it requires excluding all potential sources of defects. Both variants were proficient in the applied investigations, which cover the most relevant reasons underlying deficiency caused by missense alterations, namely stability and repair activity [30,31]; moreover, sufficient complementary clinical data are also required according to the InSiGHT criteria but are currently insufficient (Table 2). Application of the ACMG guidelines therefore also currently results in uncertain significance (Table 1, S1 Table).
The finding that the tumor of the index patient showed normal expression of MLH1 and PMS2 is consistent with the normal expression of the p.(Asn338Ser) variant; it is noteworthy that this tumor displayed loss of MSH2 and MSH6 expression, which would be compatible with an inactivating variant in the MSH2 gene. However, sequencing of the MSH2 gene did not reveal any variants, and MLPA analysis also did not detect any deletions or duplications in this gene.
Taken together, all functional findings suggest that both variant proteins are proficient. This is consistent with the clinical observation that there does not seem to be a predisposition in family TURK-1. In contrast, we did not find evidence that the potential cancer predisposition observed in family TURK-2 (with positive Amsterdam II criteria) is caused by their variant. However, there remains a small possibility that these variants may interfere with other functional aspects of the MLH1 gene, e.g. with mRNA formation or DNA damage response.
In conclusion, at a time when Next-Generation Sequencing may enable the detection of novel and multiple VUS in distinct cancer genes, functional characterization of these VUS will become more important in genetic counseling for these families.
Biochemical function analysis, performed with appropriate controls allowing a clinically meaningful reading of the results, in conjunction with additional evidence, is able to provide a reliable statement of pathogenicity if consistent results are obtained.This approach can therefore fill the gap if other methods of classification do not offer sufficient probative value.

Fig 2. Analysis of expression of the MLH1 variants. A, Expression of wild-type and variant MLH1 proteins was visualized by SDS-PAGE and western blotting. The two stability reference variants p.(Ala681Thr) (pathogenic expression defect) and p.(Val716Met) (non-pathogenic expression defect, polymorphism) were transfected in parallel. The blots shown are representative of three independent experiments, which yielded the data evaluated in (B). B, Average expression values in percent of the wild-type expression and standard deviations are shown for wild-type and variant MLH1 proteins. https://doi.org/10.1371/journal.pone.0304141.g002

Fig 3. Analysis of mismatch repair activity of the MLH1 variants. DNA mismatch repair activity was assessed for wild-type and variant MLH1 proteins, and a negative control (without MLH1 protein) was included as detailed in "Materials and Methods". A, Representative agarose gel image of the MMR activity measurement. The extent of repair is visible in the agarose gel electrophoresis by the generation of two smaller fragments ("Repair") of the unrepaired, linearized plasmid ("No repair"). B, Three independent experiments were performed, and repair activity was scored relative to wild-type MLH1 protein (100%). Average repair values and standard deviations are shown. https://doi.org/10.1371/journal.pone.0304141.g003

Fig 4. Affected residues in their structural and conservational context. Structural model of the human MLH1-PMS2 heterodimer (MLH1: green; PMS2: cyan). The N-terminal, structured ATPase regions (bound ATP in ball-stick presentation with magnesium ions as orange spheres) are connected by an unstructured linker region (symbolized as a line), including the catalytically relevant, conserved ConMot motif, to the structured C-terminal dimerization and endonuclease domains. The locations of the residues affected by alterations investigated in this study are marked in red in the structure and by red boxes in the sequence logos. Sequence conservation is shown in WebLogo presentation at the left. Yellow shading indicates hydrophobic residues relevant for helix anchoring. Green boxes above the sequence indicate the secondary structure that these sequences form. https://doi.org/10.1371/journal.pone.0304141.g004

Table 1. Clinical, genetic information and functional variant evaluation of the investigated Turkish and Tunisian patients.
1 Retrieved in January 2024 from the InSiGHT database (http://www.insight-database.org/classifications/)
2 Retrieved in January 2024 from the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/); numbers in brackets give the number of individual reports
3 Sequencing and MLPA of MSH2 and MSH6: normal

Table 2. Classification criteria as applied to the current variants (footnote 1).
Criteria for class 5 (pathogenic), applied to p.(Thr788Asnfs*11) and p.(His727Profs*57): Coding sequence variation resulting in a stop codon, i.e. a nonsense or a frameshift alteration that is not after codon 743 in MLH1 or after codon 888 in MSH2, and not in the last exon of MSH6 or PMS2.
1 Only an excerpt of these criteria applicable to the given cases. + means that one criterion is fulfilled, 0 means that for one criterion the data point is lacking. IHC, immunohistochemistry.
2 Frequency of the SNP rs63751467 in the European population according to the dbSNP database (retrieved September 2023).
https://doi.org/10.1371/journal.pone.0304141.t002